ticket #3 cs-assistant: Added :stats and verbose mode to the dev CLI#10
ticket #3 cs-assistant: Added :stats and verbose mode to the dev CLI#10strictshot wants to merge 1 commit into
Conversation
…fixed some reformatting
| log = get_logger(__name__) | ||
|
|
||
|
|
||
| async def _check_db() -> None: |
There was a problem hiding this comment.
This function could be refactored a bit. Since both startup and the :stats command are meant to do the same thing (print number of sources and chunks, warn if DB is empty), replace this with a helper that does both.
Call it something like _print_db_status().
Get the counts in this function the way you do in the stats command and print them:
count_sources, count_chunks = await Repository.get_source_and_chunk_counts(session)
From that query, you already get the count of chunks so you can do a simple if count_chunks == 0: check to see if the warning needs to be printed, instead of using the Repository.has_chunks method which makes a second db query even though we already have the info we need (number of chunks) from the first query. At our scale the second db query isn't really expensive, but still unnecessary to make since we need the counts regardless and that tells us if the count is 0 or not, so I think this way would be cleaner.
I think this might leave the has_chunks method with no uses, but don't remove it since it might be worth using somewhere else in the future. It can be cleaned up later if there's really no use for it.
|
|
||
| # :stats cmd | ||
| if question.lower() in {":stats"}: | ||
| async with async_session_factory() as session: |
There was a problem hiding this comment.
Once the shared helper based on the comment on line 12 has been implemented, this whole section (lines 37-40) could be replaced with the helper.
Same benefit, avoids making 2 queries when just one would be enough.
| return result.scalar_one_or_none() is not None | ||
|
|
||
| @staticmethod | ||
| async def get_source_and_chunk_counts(session: AsyncSession) -> tuple[int, int]: |
There was a problem hiding this comment.
These methods currently don't have tests, even though all other repository methods have tests implemented. For consistency, implement unit tests for these in tests/infrastructure/db/test_repository.py.
Follow the pattern / conventions of the other tests in that file. Run tests before committing to verify behaviour and that they pass.
| @staticmethod | ||
| async def count_chunks(session: AsyncSession) -> int: | ||
| result = await session.execute(select(func.count(ChunkRow.id))) | ||
| # if above doesn't work properly |
There was a problem hiding this comment.
Lines 26-27 look like temporary code that isn't needed since they're commented out? If yes, please remove them from the final PR.
| While inside the interactive CLI (`ask>`), you can use the following commands to control the session and view metadata: | ||
| | Command | Description | | ||
| | :--- | :--- | | ||
| | `:stats` | Toggles the display of performance metrics (e.g., token count, response time) for subsequent prompts. | |
There was a problem hiding this comment.
These descriptions don't fully match what those CLI commands are doing (e.g. stats doesn't show response time, verbose doesn't show "API interactions"). Update the descriptions so it's more clear exactly what the commands do, something like
:stats - Prints how many sources and chunks are currently loaded in the database.
:verbose - Toggles verbose mode: for each question, prints the retrieved chunks (source URL, similarity score, and a content snippet) before the answer.
| source_url = chunk_item.chunk.source_url | ||
| similarity_score = chunk_item.score | ||
| snippet = " ".join( | ||
| (chunk_item.chunk.content.split())[:250] |
There was a problem hiding this comment.
Snippet should infact be 250 characters, not words. Just replace with snippet = chunk_item.chunk.content[:250] and remove the comment below
| @@ -19,13 +20,33 @@ | |||
|
|
|||
| async def _repl() -> None: | |||
| await _check_db() | |||
There was a problem hiding this comment.
Dev CLI startup currently doesn't print the source and chunk counts, which is expected behaviour based on ticket description. Once the shared helper from the comment on line 12 has been implemented, that can be used here to both print the counts and an empty DB warning if needed.
Added two dev conveniences (:stats and :verbose).
Key changes to files :
Using Itsection)Manual verification done:
make migrate && make ingestto populate the local database.make cliand verified the startup message.:verbosemode on, submitted a query, and confirmed that the source URL, similarity scores, and character-limited snippets printed cleanly prior to the model's response.exitandquitstill terminate the session cleanly.